Abstract: Electroencephalogram (EEG) based brain-computer 
interfaces (BCI) may provide a means of communication for those 
affected by severe paralysis. However, the relatively low informa- 
tion transfer rates (ITR) of these systems, currently limited to 1 
bit/sec, present a serious obstacle to their widespread adoption 
in both clinical and non-clinical applications. Here, we report 
on the development of a novel noninvasive BCI communication 
system that achieves ITRs that are severalfold higher than those 
previously reported with similar systems. Using only 8 EEG 
channels, 6 healthy subjects with little to no prior BCI experience 
selected characters from a virtual keyboard with sustained, error- 
free, online ITRs in excess of 3 bit/sec. By factoring in the time 
spent to notify the subjects of their selection, practical, error- 
free typing rates as high as 12.75 character/min were achieved, 
which allowed subjects to correctly type a 44-character sentence 
in less than 3.5 minutes. We hypothesize that ITRs can be further 
improved by optimizing the parameters of the interface, while 
practical typing rates can be significantly improved by shortening 
the selection notification time. These results provide compelling 
evidence that the ITR limit of noninvasive BCIs has not yet been 
reached and that further investigation into this matter is both 
justified and necessary. 

Index Terms — Brain-computer interfaces, P300 speller, infor- 
mation transfer rate. 



I. Introduction 

Noninvasive electroencephalogram (EEG) based brain-computer
interfaces (BCIs) may provide a means of communication for
those affected by locked-in syndrome [1] or other forms of
severe paralysis. These systems rely on predictable EEG
patterns that are translated into control signals for real-time
operation of external devices [2]. Thus, individuals with
severe paralysis may benefit from BCI technology by bypassing
the disrupted motor pathways and operating prostheses directly
from the brain. Paralyzed individuals have successfully used
EEG-based BCIs to operate computer cursors [3], [4], a virtual
reality wheelchair [5], a virtual reality avatar [6], and
functional electrical stimulation devices [7], [8], [9].

One of the most robust EEG-based BCI communication systems is
the P300 speller, developed originally by Farwell and Donchin
[10]. This communication protocol relies on the P300 evoked
potential [11], a positive deflection in EEG signals, observed
predominantly over the parietal lobe, which occurs ~300 msec
after the presentation of an infrequent, task-relevant stimulus.
In a large majority of P300 spelling systems, the presentation
of task-relevant stimuli conforms to the visual oddball paradigm
[12], where the subject is instructed to pay attention to a rare
stimulus in a random sequence of stimuli presented on a computer
screen. The subject's intentions can then be decoded in real
time by detecting the presence of the P300 potential that
coincides with the illumination of letters from a virtual
keyboard. It was hypothesized in [10] that such a system could
achieve information transfer rates (ITRs) as high as 0.2 bit/sec
(or 2.3 characters/min).

While subsequent studies (e.g. [13], [14]) have managed to
optimize the original BCI spelling system and thus significantly
improve its performance, the achieved ITRs are still relatively
modest and fall well below those of communication and/or control
systems that rely on residual motor function, such as eye
movements [15]. Whether used in spelling, computer cursor
movement, or other applications, it is generally accepted that
the ITR limit of EEG-based BCIs is ~1 bit/sec [2], which remains
a major obstacle to their adoption in both clinical and
non-clinical applications.

In this article, we report on the development of a novel 
EEG-based BCI communication protocol, where subjects with 
little to no prior BCI experience were able to achieve sus- 
tained, error-free ITRs in excess of 3 bit/sec in a real-time 
typing test. These bit rates are severalfold higher than those 
previously reported by P300 spellers and EEG-based BCIs. We 
hypothesize that optimization of the interface parameters and 
user training may further increase the communication speed 
limit of these systems, which may have a significant impact 
on their adoption in both clinical and non-clinical applications. 

II. Methods 

A. Study Protocol 

Six able-bodied individuals (see Table I for demographic data)
were recruited for this study. All subjects had normal or
corrected-to-normal vision and no cognitive or neurological
impairments. The study was approved by the University of
California Irvine Institutional Review Board.

Each participant completed three experimental sessions performed
on three different days over the course of 1-3 weeks. Within
each daily session, a subject performed BCI spelling experiments
at three different interface speeds (see Table II). For each
speed, a short training procedure was performed, followed by 1-3
online spelling sessions. A detailed description of these
procedures is given in Section II-C.

Online sessions were performed in a free spelling mode [1].
Specifically, the subjects were asked to correctly spell the
following 44-character sentence: THE QUICK BROWN FOX JUMPS OVER
THE LAZY DOG*. This includes spaces and the symbol * at the end,
which serves to exit the interface. Note that each letter of the
English alphabet appears at least once in the sentence. In the
case of a typing error, the subjects used backspace to delete
erroneously selected characters, and then proceeded with the
correct sequence of letters. All subjects were able to type the
benchmark sentence free of errors and exit the interface (see
Section III for details). Also, the subjects had no trouble
memorizing the sentence and were able to track their spelling
progress on the computer screen. The total daily involvement per
subject was 2-3 hours.

TABLE I
Demographic data of the study participants.

Subject   Gender   Age   Prior BCI experience   Native speaker
A         F        23    Yes (3 hours)          Yes
B         M        40    Yes (10 hours)         No
C         M        29    Yes (1 hour)           Yes
D         M        22    No                     Yes
E         M        24    No                     Yes
F         F        56    Yes (10 hours)         Yes

TABLE II
Breakdown of the study protocol.

Day   Interface speed (in order of presentation)
1     slow, medium, fast
2     medium, fast, slow
3     fast, slow, medium

B. Data Acquisition 

Each subject was seated ~0.9 m from a computer monitor that
displayed a 6x7 matrix of characters (see Fig. 1). An EEG cap
(Compumedics USA, Charlotte, NC) with 19 sintered Ag-AgCl
electrodes, arranged according to the 10-20 International
Standard, was used for EEG recording. Conductive gel was applied
to a subset of eight electrodes at the following locations: C3,
Cz, C4, P3, Pz, P4, O1, and O2 (see Fig. 3). Ear clip
electrodes, A1 and A2, were linked and used as a reference
electrode. However, if the 30-Hz ear-to-ear impedance was above
10 kΩ, only the ear with the lower impedance was used as a
reference. The impedances between the reference and each of the
eight electrodes were <5 kΩ. The signals were then amplified
(gain: 5,000) and band-pass filtered (1-35 Hz) using 8
single-channel EEG bioamplifiers (Biopac Systems, Goleta, CA),
and were digitized (sampling rate: 200 Hz, resolution: 16 bits)
by the MP150 acquisition system (Biopac Systems). The data
acquisition and experimental protocols were controlled by
custom-made Matlab (Mathworks, Natick, MA) scripts. EEG data
recorded during training procedures were analyzed offline, while
those recorded during online sessions were analyzed in real time
(details in Section II-C).

C. Experimental Protocol 

For each choice of interface speed (see Table II), a training
session was performed first, followed by an online spelling
test. A training session started by instructing the subject to
focus on a specific character for 30 sec. Within this time
frame, the characters were illuminated randomly in groups of 6
in a block-randomized fashion, i.e. after a single cycle
consisting of seven illuminations, all 42 characters had been
illuminated exactly once. The details of the randomization
algorithm are described in Appendix A. Its main function is to
group characters according to the frequency with which they
appear in the English language, so that groups containing more
frequent letters can be illuminated earlier in the cycle. The
benefits of this algorithm are discussed in Section IV-A. Also,
note that as only one of the groups contains the desired
character, the ratio of oddball to non-oddball stimuli in the
training sessions is 1:6. Upon completing the cycle, the groups
were re-randomized, and the whole procedure was repeated for a
total of 30 sec. The frequency of illumination was controlled by
the inter-trial interval (ITI) (see Fig. 2), with a duty cycle
of 60% (t_on/ITI = 0.6). After 30 sec, a short break ensued,
during which the subject was instructed to focus on another
character, and the whole procedure was repeated for a total of
10 characters. In total, the training session, including breaks,
lasted ~6.5 min.

Fig. 1. A screenshot of the character matrix. The illuminated
letters are bold-faced and highlighted (pink). The prompt
(yellow) shows the typing progress.




Fig. 2. The timing diagram of the experimental protocol. 

In response to each illumination, t_a = 400 msec of EEG data
were acquired and stored for offline analysis. Throughout this
article, a single 400-msec data segment is referred to as a
trial. The total number of trials in each training procedure
depended on the choice of ITI. In particular, the slow, medium
and fast interface speeds (Table II) corresponded to an ITI of
400, 240 and 160 msec, respectively. Note that for the medium
and fast speeds, the successive trials partially overlapped (as
illustrated in Fig. 2).
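The dependence of the trial count and trial overlap on the ITI can be made concrete with a short calculation. The sketch below assumes that illumination runs continuously for the full 30 sec per target character and that all 10 targets are used; the function and variable names are illustrative and are not taken from the original Matlab scripts.

```python
# Nominal trial counts per training session (illustrative sketch).
# Assumptions: 10 target characters, 30 s of continuous illumination each,
# and one 400-msec EEG trial acquired per illumination.
TARGETS = 10
SECONDS_PER_TARGET = 30
TRIAL_MSEC = 400
ITI_MSEC = {"slow": 400, "medium": 240, "fast": 160}


def nominal_trials(iti_msec):
    """Number of illuminations that fit into one training session."""
    return TARGETS * SECONDS_PER_TARGET * 1000 // iti_msec


def overlap_msec(iti_msec):
    """Duration shared by two successive 400-msec trials."""
    return max(0, TRIAL_MSEC - iti_msec)


trial_counts = {name: nominal_trials(iti) for name, iti in ITI_MSEC.items()}
overlaps = {name: overlap_msec(iti) for name, iti in ITI_MSEC.items()}
print(trial_counts)
print(overlaps)
```

Under these assumptions the nominal counts are 750, 1250 and 1875 trials. The Results section reports 1870 trials at ITI=160 msec, slightly below the nominal value for the fast speed.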

During online sessions, the illumination of characters was
controlled as follows. In the initial stage (referred to as
stage 1), the computer illuminated characters in groups of six
(see Fig. 1) in the same manner as in the training sessions. In
response to each illumination, an EEG trial was processed in
real time and classified (see Section II-D). As long as the
trials were classified as non-oddball, the interface kept
cycling through stage 1. Once an oddball trial was detected, the
BCI computer transitioned to stage 2, where individual
characters from the selected group were illuminated. Note that
faster transition to stage 2 was facilitated by illuminating the
more frequent characters earlier in the stage-1 cycle.

Similar to stage 1, the illumination order in stage 2 was based
on the characters' relative frequencies (see Appendix A). Once
an individual character was selected (by classifying its
corresponding trial as oddball), the interface highlighted the
selected character using a green rectangular background and
paused for 3 sec to let the subject know of the decision. In
addition, the selected character was copied to the typing prompt
(see Fig. 1) so that the typing progress could be tracked. The
BCI computer then transitioned back to stage 1, and the whole
process was repeated. An exception to this rule occurred when
the BCI computer found a limited number (<5) of dictionary
completions to the partially spelled word. In this case, the
completion characters were illuminated individually in a random
order. Repeated failure (over three cycles) to select a
character in this single-character illumination stage resulted
in the BCI computer defaulting to stage 1.

In addition to the aforementioned single-trial selection method,
the BCI kept track of the highest letter count. More
specifically, this method integrates the evidence, expressed as
the posterior probability (see Section II-D), of individual
characters. A character that has held the highest integrated
probability for over 10 consecutive trials is immediately
classified as oddball and selected, thus bypassing stage 2. This
mechanism is useful when the evidence based on a single trial is
not sufficiently strong to warrant the character selection. Upon
character selection, the counter is reset to zero and all the
posterior probabilities are reset to 1/42.
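The letter-count mechanism can be sketched as follows. The text does not specify how the per-trial posteriors are integrated, so the multiplicative update below is only one plausible reading: each trial's oddball posterior is treated as evidence for the illuminated characters and against all others. The class name `EvidenceAccumulator` and its method names are hypothetical.

```python
# Sketch of an evidence accumulator over the 42 characters.
# ASSUMPTION: the update rule (multiply and renormalize) is illustrative;
# the paper only states that posteriors are integrated across trials.
N_CHARS = 42          # size of the character matrix
STREAK_NEEDED = 10    # leader must persist for more than this many trials
EPS = 1e-6            # keeps posteriors away from exactly 0 or 1


class EvidenceAccumulator:
    def __init__(self):
        self.reset()

    def reset(self):
        """Reset the streak counter and all character probabilities to 1/42."""
        self.probs = [1.0 / N_CHARS] * N_CHARS
        self.leader = None
        self.streak = 0

    def update(self, illuminated, p_oddball):
        """Fold one trial into the running probabilities.

        illuminated: indices of the characters lit during the trial.
        p_oddball:   classifier posterior that the trial was an oddball.
        Returns the selected character index, or None if no selection.
        """
        p = min(max(p_oddball, EPS), 1.0 - EPS)
        lit = set(illuminated)
        weighted = [q * (p if i in lit else 1.0 - p)
                    for i, q in enumerate(self.probs)]
        total = sum(weighted)
        self.probs = [w / total for w in weighted]

        leader = max(range(N_CHARS), key=lambda i: self.probs[i])
        if leader == self.leader:
            self.streak += 1
        else:
            self.leader, self.streak = leader, 1

        if self.streak > STREAK_NEEDED:
            self.reset()
            return leader
        return None


# Usage: character 5 is lit alone and repeatedly scored as a likely oddball.
acc = EvidenceAccumulator()
selections = [acc.update([5], 0.9) for _ in range(11)]
print(selections)
```

In this toy run the character becomes the leader on the first trial and is selected once it has led for more than 10 consecutive trials, after which the state is reset.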

D. Signal Processing and Classification 

Within each trial (see Fig. 2), the first 100 msec of data was
presumed to contain no useful information due to the lag in
visual information processing [16], and so the trials were
effectively truncated to 300 msec. Subsequently, trials from the
training procedure, represented by 8 × 60 matrices (8 channels,
0.3 × 200 samples), were reshaped into 480-dimensional (480-D)
vectors. To allow accurate estimation of data statistics under
both oddball and non-oddball conditions, and to facilitate
subsequent classification of trials into these two classes, the
dimension of data was reduced using a combination of classwise
principal component analysis (CPCA) [17], [18] and approximate
information discriminant analysis (AIDA) [19].
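The truncation and reshaping step can be illustrated with plain Python lists. This is a minimal sketch: the channel-major flattening order is an assumption (the text does not state the ordering), and `trial_to_vector` is a hypothetical helper name.

```python
# Truncate a 400-msec, 8-channel trial to its last 300 msec and flatten it.
# ASSUMPTION: samples are concatenated channel by channel.
FS_HZ = 200          # sampling rate
N_CHANNELS = 8
TRIAL_MSEC = 400
DISCARD_MSEC = 100   # initial segment presumed uninformative


def trial_to_vector(trial):
    """trial: list of 8 channels, each a list of 80 samples."""
    n_total = TRIAL_MSEC * FS_HZ // 1000      # 80 samples per channel
    n_discard = DISCARD_MSEC * FS_HZ // 1000  # first 20 samples dropped
    if len(trial) != N_CHANNELS or any(len(ch) != n_total for ch in trial):
        raise ValueError("expected an 8 x 80 trial")
    truncated = [ch[n_discard:] for ch in trial]        # 8 x 60
    return [sample for ch in truncated for sample in ch]  # 480 values


# Synthetic trial whose values encode (channel, sample) for easy checking.
trial = [[100 * ch + s for s in range(80)] for ch in range(8)]
vector = trial_to_vector(trial)
print(len(vector))
```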

For binary pattern recognition problems, CPCA projects
high-dimensional data onto a pair of subspaces locally adapted
to individual classes. Due to its nonlinear (piecewise linear)
nature, CPCA is well-suited for problems where traditional
linear dimensionality reduction techniques may be inadequate. In
addition, unlike PCA and other nonlinear dimensionality
reduction methods [20], CPCA is a supervised learning technique,
and it therefore takes advantage of the known class information.
In the present study, implementing CPCA with default parameters
[17] typically resulted in dimension reduction from 480 to
20-30. A detailed description of this technique can be found in
[17].

To enhance class separability while further reducing the
dimension of data, AIDA [19] was used. AIDA represents an
approximation of an information-theoretic technique [21], which
extracts features by maximizing the mutual information between
the class labels and data. However, unlike computationally
expensive information-theoretic methods [21], [22], AIDA retains
the computational simplicity characteristic of linear,
second-order techniques, such as linear discriminant analysis
[23]. Specifically, the feature extraction matrix,
$T_{\mathrm{AIDA}}$, is found through eigen-decomposition. In
this study, 1-D features were extracted in this manner, i.e.
$T_{\mathrm{AIDA}} \in \mathbb{R}^{1 \times m}$, where $m$ is
the subspace dimension found by CPCA ($m$ = 20-30). A detailed
account of AIDA can be found in [19].

Once 1-D features were extracted, a linear Bayesian classifier
was designed in the feature domain:

$$\frac{p(o \mid F^*)}{p(e \mid F^*)} \; \underset{e}{\overset{o}{\gtrless}} \; \theta \qquad (1)$$

where $p(o \mid F^*)$ and $p(e \mid F^*)$ are the posterior
probabilities of oddball and non-oddball classes given the
observed feature $F^*$, respectively. The equation (1) reads:
"classify $F^*$ as oddball if $p(o \mid F^*)/p(e \mid F^*) >
\theta$, and vice versa." The threshold $\theta =
\lambda_{fa}/\lambda_{om}$ is the ratio of the false-alarm cost
to the omission cost, in which case the classifier (1) minimizes
the total risk function [23], [24] (see Appendix B for details).
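The decision rule (1) can be sketched for a 1-D feature. The example below assumes Gaussian class-conditional densities with a shared standard deviation and a 1:6 prior ratio; these modeling choices, the parameter values, and the function names are illustrative assumptions rather than the classifier parameters used in the study.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_ratio(f, mean_o, mean_e, std, prior_o=1.0 / 7.0):
    """p(o | f) / p(e | f) via Bayes' rule; the evidence term cancels."""
    prior_e = 1.0 - prior_o
    return (gaussian_pdf(f, mean_o, std) * prior_o) / (
        gaussian_pdf(f, mean_e, std) * prior_e)


def classify(f, theta, mean_o, mean_e, std, prior_o=1.0 / 7.0):
    """Rule (1): oddball if the posterior ratio exceeds theta."""
    ratio = posterior_ratio(f, mean_o, mean_e, std, prior_o)
    return "oddball" if ratio > theta else "non-oddball"


# Hypothetical class statistics in the 1-D feature domain.
model = dict(mean_o=2.0, mean_e=0.0, std=1.0)
for feature in (0.0, 1.72, 3.0):
    print(feature,
          classify(feature, theta=1.0, **model),
          classify(feature, theta=0.5, **model))
```

Lowering the threshold from 1 to 0.5 moves the decision boundary toward the non-oddball mean, so borderline features are accepted as oddball at the price of more false alarms.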

The performance of the above feature extraction and
classification methods was tested on the training-procedure data
using stratified 10-fold cross-validation (CV) [25]. Briefly,
the EEG trials were randomly separated into 10 groups (folds)
while approximately preserving the 1:6 oddball to non-oddball
ratio in each group. The data from 9 folds were then used to
train the parameters of CPCA, AIDA, and the Bayesian classifier
(1), and the data from the remaining fold were transformed into
the feature domain and classified assuming an equal cost of
omissions and false alarms (i.e. $\theta = 1$). The procedure
was then repeated until all 10 folds were exhausted, each time
designating a different fold for classification. The number of
misclassified trials was used to calculate the probabilities of
omission and false alarm errors. To estimate the standard
deviation of these errors, the 10-fold CV procedure was repeated
10 times, each time re-randomizing the grouping of trials into
folds. The parameters of CPCA, AIDA, and the Bayesian classifier
were then saved for the online procedure.
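The stratified fold assignment described above can be sketched without any external library. This is a minimal illustration, not the authors' Matlab code: `stratified_folds` is a hypothetical helper, and the 100/600 label split is a synthetic example chosen to give an exact 1:6 ratio.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split trial indices into k folds, preserving class proportions.

    Indices of each class are shuffled and dealt round-robin into folds,
    so every fold receives approximately the same class ratio.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Synthetic labels with a 1:6 oddball to non-oddball ratio.
labels = ["oddball"] * 100 + ["non-oddball"] * 600
folds = stratified_folds(labels, k=10, seed=0)

for held_out in range(10):
    test_idx = folds[held_out]
    train_idx = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
    # Train CPCA, AIDA and the classifier on train_idx; evaluate on test_idx.
    n_odd = sum(labels[i] == "oddball" for i in test_idx)
    print(held_out, len(train_idx), len(test_idx), n_odd)
```

Repeating the procedure with different `seed` values corresponds to re-randomizing the grouping of trials into folds.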

In the online sessions, trials acquired in real time were
reshaped in the same way as the training data, and subsequently
transformed into features using CPCA and AIDA transformations.
The 1-D features were then classified with the Bayesian
classifier (1). For stage 1 online classification, the same
threshold, $\theta = 1$, was used as in the training session.
For stage 2 online classification, or when the interface was in
a single-character illumination mode, the threshold changed to
$\theta = 0.5$. This choice reflects our empirical finding that
P300 weakens when characters are illuminated individually.

E. Information Transfer Rate Calculations 

The described BCI system can be modeled as a binary
communication channel (see Appendix C) whose inputs are user
intentions and outputs are the decoded intentions. The amount of
information per transmission is given by the mutual information
between inputs and outputs [26], i.e.

$$I(\mathrm{in}, \mathrm{out}) = H(\mathrm{out}) - H(\mathrm{out} \mid \mathrm{in}) \qquad (2)$$

which measures the reduction in the output uncertainty by
providing the input. The explicit formula for calculating (2) is
given in Appendix C. The ITR can then be calculated as: ITR = B
· I(in, out), where B is the number of transmissions (character
illuminations) per unit of time.
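Equation (2) can be evaluated directly from the channel's error probabilities. The sketch below parameterizes the binary channel by the oddball prior, the omission probability and the false-alarm probability; this parameterization and the function names are assumptions made for illustration, since the explicit formula of Appendix C is not reproduced here.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(p_o, p_omission, p_false_alarm):
    """I(in, out) = H(out) - H(out | in) for a binary asymmetric channel.

    p_o:           prior probability of an oddball input
    p_omission:    P(out = non-oddball | in = oddball)
    p_false_alarm: P(out = oddball | in = non-oddball)
    """
    p_e = 1.0 - p_o
    p_out_o = p_o * (1.0 - p_omission) + p_e * p_false_alarm
    h_out = entropy([p_out_o, 1.0 - p_out_o])
    h_out_given_in = (p_o * entropy([p_omission, 1.0 - p_omission])
                      + p_e * entropy([p_false_alarm, 1.0 - p_false_alarm]))
    return h_out - h_out_given_in


def itr(transmissions_per_sec, bits_per_transmission):
    """ITR = B * I(in, out)."""
    return transmissions_per_sec * bits_per_transmission


# Hypothetical error rates at a 1:6 oddball to non-oddball ratio.
bits = mutual_information(1.0 / 7.0, p_omission=0.10, p_false_alarm=0.03)
print(bits, itr(6.25, bits))
```

A classifier that labels every trial as non-oddball is correct 6/7 of the time at a 1:6 ratio, yet this function assigns it zero bits, because its output carries no information about the input.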

For online performances, the total time T to correctly type the
benchmark sentence (44 characters) was recorded by the BCI
computer. This time included the 3 sec pause after each
selection that allowed subjects to be notified of their
selection, track the typing progress, and visually locate the
next desired character. This was true regardless of whether a
correct or incorrect selection was made. In addition, the
subjects were required to correct the incorrect selections by
backspacing. While in this case the selection of < (backspace)
represents an intended action, backspaces were not counted as
correct selections, since their purpose is merely to rectify
previously committed error(s). As stringent as these
requirements are, we believe that they set a standard for the
definition of ITR that is completely immune to bit rate
manipulations. More formally, practical error-free ITR is
defined as:

$$\mathrm{ITR} = \frac{N_c}{T} \log_2 |A| \qquad (3)$$

where $N_c$ is the number of correctly typed characters and
$|A|$ is the size of the alphabet ($N_c = 44$, $|A| = 42$ in
this study).
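As a worked example of definition (3), the snippet below plugs in the fastest session time reported in the Results (207.1 sec for the 44-character sentence). The function name `practical_itr` is illustrative.

```python
import math


def practical_itr(n_correct, total_sec, alphabet_size):
    """Practical error-free ITR in bit/sec, following definition (3)."""
    return n_correct / total_sec * math.log2(alphabet_size)


N_CORRECT = 44       # characters in the benchmark sentence
ALPHABET_SIZE = 42   # characters in the 6x7 matrix
BEST_TIME_SEC = 207.1

best_itr = practical_itr(N_CORRECT, BEST_TIME_SEC, ALPHABET_SIZE)
chars_per_min = N_CORRECT / BEST_TIME_SEC * 60
print(round(best_itr, 3), round(chars_per_min, 2))
```

These values reproduce the 1.146 bit/sec and 12.75 character/min quoted for the best online session.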

III. Results 

The data from each training procedure were used for offline
estimation of feature extraction and classification parameters,
as outlined in Section II-D. Event-related potential (ERP)
analysis (obtained by averaging oddball and non-oddball trials)
consistently revealed that subjects utilized both N200, mostly
visible on the occipital lobe ~190 msec post-stimulus (see
Fig. 3), and P300, which was present on all channels ~285-300
msec post-stimulus. This is consistent with findings reported by
other groups, e.g. [13], [27]. In addition, cross-correlation
analysis revealed that the prominent positive potential seen in
the fronto-parietal areas (most notably at electrodes Cz and C4)
approximately 190 msec post-stimulus was likely coming from the
N200 source, albeit recorded from the opposite side of the
dipole, and hence the phase reversal [28].




Fig. 3. Event-related potentials of oddball (red) and non-oddball (blue) trials
for Subject B, collected at the slow interface speed. The error bars represent
the standard error of the mean. Each panel is 18 μV × 300 msec, with the grid
lines corresponding to 200 and 300 msec post-stimulus.

The offline performances estimated through 10-fold CV and
expressed as the probability of correct classification are
presented in Table III. Classification rates as high as 97.4%
were achieved, and all subjects performed significantly above
the 85.71% threshold, determined to be the chance-level
performance of the Bayesian classifier (see Appendix C). The
number of trials varied depending on the ITI: 750 (ITI=400
msec), 1250 (ITI=240 msec), and 1870 (ITI=160 msec). To rule out
the effect of the sample size on the achieved classifier
performances, the feature extraction and classifier training
procedures were repeated in the case of ITI=240 msec and ITI=160
msec by randomly sub-selecting 750 trials while preserving the
1:6 oddball-to-non-oddball ratio. The classification rates were
not significantly different from those obtained using all
available trials, so the differences in the offline performances
observed across ITIs are likely caused by other factors, such as
the dependence of the P300 amplitude on the ITI [29]. Finally,
the offline ITRs were calculated using Eq. (2). Based on these
performances, all subjects were expected to have purposeful
control of the BCI in the online mode (see below).

In the online sessions, the performance of each subject 
was determined by the total time taken to correctly type the 
44-character sentence (see Table IV). All subjects achieved 
their best results at the high interface speed and were able to 
complete the task within a 3.45^.51 min time window. The 
subjects' average performances also demonstrate that they
preferred the high interface speed (note that Subject A
performed n = 5 online sessions for the 400 ms speed, while all
other averages were based on n = 6 sessions). This was true
despite the highest offline ITRs being achieved at the slow
interface speed (cf. Table III), and it indicates that the
speed-accuracy trade-off is well exploited using an ITI=160
msec. In addition, the practical, error-free ITRs were
calculated using (3), and they reached values as high as 1.146
bit/sec. This bit rate corresponds to correctly typing 12.75
characters/min (see supplementary movie at
http://www.youtube.com/user/UCIBCI). It should be noted that out
of the 207.1 sec taken to complete the sentence, only 78.1 sec
were spent on letter selection, while 129 sec were spent on
post-selection pauses. A similar breakdown applies to other
subjects, where post-selection time constituted more than 60% of
the total time.

TABLE III
Offline performances of subjects as assessed through 10-fold CV.
For each subject, rows show mean performance (top), best
performance (middle), and ITR of best performance in bit/trial
(bottom).

Subject   ITI = 400 msec   ITI = 240 msec   ITI = 160 msec
A         95.7±1.5%        94.1±1.1%        93.4±1.6%
          97.0%            94.9%            94.4%
          0.403*           0.308            0.293
B         95.2±1.9%        93.9±2.5%        94.3±2.9%
          97.4%            96.2%            96.6%
          0.427**          0.365            0.385
C         93.9±2.5%        93.7±2.8%        93.5±1.9%
          96.6%            96.9%            95.5%
          0.385            0.397*           0.335
D         91.2±0.8%        90.5±0.2%        91.1±1.0%
          92.1%            90.7%            92.2%
          0.196            0.150            0.202*
E         91.2±1.5%        92.8±2.2%        90.5±2.3%
          92.3%            95.1%            92.4%
          0.211            0.325*           0.221
F         94.2±0.9%        93.3±0.9%        93.8±0.3%
          95.3%            94.2%            94.0%
          0.327*           0.279            0.273

* marks personal best and ** overall best.

TABLE IV
Online performances (in sec, includes the 3 sec pause after
letter selection) across subjects and days. Mean performance
(top row), best performance and day at which it was achieved
(bottom row), and information transfer rates of the best online
session (last column) are given.

Subject   ITI = 400 msec   ITI = 240 msec   ITI = 160 msec   ITR (bit/sec)
A         324.6±28.1       328.5±37.0       302.5±20.3       0.873*
          302.9 (3)        289.4 (2)        271.7* (1)
B         337.9±27.7       304.1±48.1       265.0±52.5       1.107*
          299.8 (3)        248.9 (3)        214.3* (2)
C         395.8±67.3       305.4±44.8       239.5±37.6       1.146**
          301.8 (3)        254.6 (3)        207.1** (2)
D         495.4±61.3       380.7±77.0       306.4±86.6       0.998*
          447.0 (1)        275.0 (1)        237.8* (2)
E         558.3±104.4      397.6±83.2       471.5±162.9      0.792*
          471.9 (1)        323.5 (2)        299.7* (1)
F         465.4±118.3      346.2±53.4       254.2±14.4       1.017*
          354.7 (3)        263.4 (3)        233.2* (2)

* marks personal best and ** overall best.

Before comparing the sustained, error-free ITRs achieved in this
study to those of other EEG-based BCI systems, we make the
following observations: (i) reported ITRs often exclude or
simply ignore post-selection time (3 sec in the present study)
from calculations [30], [31], [32], and (ii) reported ITRs are
rarely, if ever, calculated in an error-free fashion, i.e. the
subjects are not required to correct spelling errors before
proceeding. Table V shows a comparison of the peak character
selection ITRs achieved in the present study to those derived
from other EEG-based BCI studies. The present-study ITRs were
obtained from (3) by subtracting the total post-selection time
from the personal best times reported in Table IV. While these
rates were nominally achieved at ITI=160 msec (or 6.25
trial/sec), the values reported in the middle column are
somewhat lower due to real-time processing demands. For the
study in [13], the performance of the best subject (Subject A)
was determined based on 11 illumination sequences (the study
uses 15 sequences, but a 0% error rate appears to be achieved
after 11 sequences). This corresponds to correctly spelling at a
rate of 23.1 sec (11 × 2.1) per character, which given a
36-character matrix and according to (3), yields an ITR of 0.224
bit/sec. Also, this study used an ITI=175 msec, which is
equivalent to 5.71 trial/sec. Similar to [13], this corresponds
to correctly spelling at a rate of 6.2 sec (3 × 2.06) per
character. Given the same 36-character matrix and according to
(3), this subject's result yields an ITR of 0.836 bit/sec. Since
an ITI similar to that of [13] was used in this study (~172
msec), an equivalent trial frequency and ITR in bits/trial of
5.82 trials/sec and 0.144 bit/trial, respectively, can easily be
calculated.

TABLE V
Comparison of the best achieved information transfer rates for
several EEG-based BCI studies. Rows A-F are the subjects of the
present study.

Study   ITR (bit/trial)   Trial frequency (trial/sec)   ITR (bit/sec)
A       0.363             5.81                          2.109
B       0.514             5.82                          2.992
C       0.522             5.82                          3.038
D       0.422             5.82                          2.455
E       0.302             5.85                          1.766
F       0.415             5.86                          2.434
[13]    0.039             5.71                          0.224
[14]    0.129             8                             1.028
[3]     2.373             0.526                         1.249
[?]     0.859             0.463                         0.398
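The derivation of the present-study entries in Table V can be checked for Subject C, whose best session lasted 207.1 sec, of which 129 sec were post-selection pauses. The sketch below applies (3) to the remaining selection time and converts the result to bit/trial using the effective trial frequency listed in the table; the variable names are illustrative.

```python
import math

N_CORRECT = 44            # correctly typed characters
ALPHABET_SIZE = 42        # size of the character matrix
TOTAL_SEC = 207.1         # Subject C, best online session (Table IV)
PAUSE_SEC = 129.0         # total post-selection time in that session
TRIALS_PER_SEC = 5.82     # effective trial frequency (Table V)

selection_sec = TOTAL_SEC - PAUSE_SEC
selection_itr = N_CORRECT / selection_sec * math.log2(ALPHABET_SIZE)
bits_per_trial = selection_itr / TRIALS_PER_SEC
pauses = PAUSE_SEC / 3.0  # number of 3-sec notification pauses

print(round(selection_sec, 1), round(selection_itr, 3),
      round(bits_per_trial, 3), round(pauses))
```

The 129 sec of pauses correspond to 43 notifications of 3 sec each, which is what one would expect if the 44 characters were typed without errors and no pause followed the final exit symbol (the latter is an inference, not stated in the text).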

Bit rates as high as 61.70 bit/min were reported in [14]
(Subject 14), from which the value in Table V immediately
follows. The illumination frequency of 8 trials/sec readily
follows from an ITI=125 msec used in the study. The study in [3]
was not concerned with a BCI spelling task; rather, it reported
on BCI-controlled cursor movements to a series of 8 screen
targets. The best performance (Subject D) corresponded to an
accuracy of 92%, which using the formula (4) yields 2.373
bit/trial. With the average duration of a trial being 1.9 sec,
the ITR of 1.249 bit/sec follows readily. Finally, the study in
[?] determined the speed and accuracy of two different flashing
paradigms: single letter display and row-column display of
flashing characters in a 6 × 6 matrix. It was determined that
the single letter display paradigm was able to achieve higher
communication speeds, spelling approximately 1 letter every 13
sec with a 95% accuracy across all 5 participants. Thus, with
this typing speed, subjects were able to spell a 42-character
sentence in 546 sec, which corresponds to a 0.398 bit/sec ITR
using equation (3) while assuming error-free performances and
ignoring post-selection notification times.
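The literature-derived values quoted above follow from short calculations, sketched here. The multi-class expression is the one given in the Discussion, log2 C + p_c log2 p_c + p_e log2(p_e/(C-1)); the helper name `multiclass_bits_per_trial` is illustrative.

```python
import math


def multiclass_bits_per_trial(n_classes, p_correct):
    """log2 C + p_c log2 p_c + p_e log2(p_e / (C - 1)), in bit/trial."""
    p_err = 1.0 - p_correct
    bits = math.log2(n_classes)
    if p_correct > 0:
        bits += p_correct * math.log2(p_correct)
    if p_err > 0:
        bits += p_err * math.log2(p_err / (n_classes - 1))
    return bits


# Cursor control: 8 targets, 92% accuracy, 1.9 sec per trial.
cursor_bits = multiclass_bits_per_trial(8, 0.92)
cursor_trials_per_sec = 1.0 / 1.9
cursor_itr = cursor_bits * cursor_trials_per_sec

# Speller reported in bit/min, converted to bit/sec at 8 trial/sec.
speller_itr = 61.70 / 60.0
speller_bits = speller_itr / 8.0

# Single letter display: 42 characters in 546 sec, 36-character matrix.
single_letter_itr = 42 / 546 * math.log2(36)

print(round(cursor_bits, 3), round(cursor_trials_per_sec, 3), round(cursor_itr, 3))
print(round(speller_itr, 3), round(speller_bits, 3))
print(round(single_letter_itr, 3))
```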

IV. Discussion 

A. Performance 

The above results dispel the common assumption regarding 
ITRs achievable by EEG-based BCIs [2], [?]. In particular, 
our system allows characters to be selected in an error-free 
fashion with ITRs in excess of 3 bit/sec (cf. Table V), which is
three times higher than the best bit rates achieved with similar
EEG spelling systems [14], and nearly three times higher than
those achieved with 2-D cursor control [3].

The superior performance of our system can be attributed to
several factors. Firstly, it is a truly single-trial system,
i.e. it reliably classifies oddball and non-oddball stimuli
after a single illumination. Other systems, such as those based
on the original Farwell-Donchin paradigm [10], require repeated
(up to 20) presentations of an oddball stimulus before a
selection is made [13], [14]. Similar requirements are imposed
in the so-called checkerboard paradigm [14]. Also, the ability
to classify oddball and non-oddball stimuli on a single-trial
basis with rates as high as 97.4% (see Table III) is facilitated
by a combination of techniques [17], [18], [19] briefly
described in Section II-D. The most distinct feature of this
method is that it can efficiently handle high-dimensional (480-D
in the present study) spatio-temporal data without resorting to
heuristic strategies such as subsampling EEG signals [13], [35]
or constructing a feature set by addition/deletion of individual
attributes [13], [14], [35]. This method has also been used to
successfully classify other types of neurophysiologic data, such
as electrocorticograms (ECoG) [36], and in other types of BCI
applications, such as asynchronous control of a virtual reality
avatar [6], [37], hand orthosis [38], and functional electrical
stimulator [7]. Secondly, biasing the illumination order of
characters according to their frequencies (see Appendix A)
significantly reduces the time the subject spends waiting for
the desired character to get illuminated. For example, based on
100,000 Monte Carlo trials, we estimated that the 12 most
frequent characters (see Fig. 4) are on average illuminated
within the first two groups. For comparison, if the characters
had been grouped in a uniformly random fashion, both frequent
(e.g. E) and infrequent (e.g. Z) characters would have been on
average found in the fourth group. Thus, the above sampling
procedure exploits the relatively low entropy of the English
language [26], which in turn facilitates faster character
selection. Similarly, the partial word completion feature
prompts users to first select those letters that represent
dictionary-defined completions, thereby bypassing stage 1 and
yielding significant time savings.
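The uniform-grouping baseline mentioned above is easy to reproduce by simulation. The sketch below covers only that baseline: it shuffles the 42 characters into 7 groups of 6 and records the position of the group containing a fixed target. The frequency-biased algorithm of Appendix A is not implemented here, and the function name is illustrative.

```python
import random


def mean_group_position_uniform(n_cycles=100_000, n_groups=7,
                                group_size=6, target=0, seed=1):
    """Average 1-based position of the group containing `target`
    when characters are assigned to groups uniformly at random."""
    rng = random.Random(seed)
    chars = list(range(n_groups * group_size))
    total = 0
    for _ in range(n_cycles):
        rng.shuffle(chars)
        total += chars.index(target) // group_size + 1
    return total / n_cycles


baseline = mean_group_position_uniform(n_cycles=20_000)
print(baseline)
```

Under uniform grouping every character, frequent or not, is equally likely to land in any of the 7 groups, so the expected position is (1 + 7)/2 = 4, in line with the "fourth group" figure quoted in the text.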

B. Information Transfer Rates 

The discrepancy in per-trial ITRs between the offline (cf.
Table III) and online performances (cf. Table V) can be
explained by two factors. Firstly, the effect of feedback and
subsequent user-interface interaction (e.g. excitement,
frustration), specific to online sessions, cannot be accounted
for with offline data. Secondly, the 1:6 oddball-to-non-oddball
ratio observed in the training procedure may be significantly
higher in the online procedure. Since the order of character
illumination depends on their frequencies, the desired
characters are likely to be illuminated early in the cycle (see
Appendix A), and users are likely to select them before all 7
groups are illuminated, thereby disturbing the 1:6 ratio. To
underscore this point, offline ITRs corresponding to a high
interface speed (see Table III) were recalculated according to
(2) with the assumed 1:3 oddball-to-non-oddball ratio, and
values between 0.274 and 0.524 bit/trial were obtained, which
are remarkably close to the ITRs achieved online (see Table V).
Therefore, the mutual information formula (2) provides a
reasonably accurate estimate of ITRs achievable online.

Based on the above, it follows that the formula [39]: 

$$I(\text{in},\text{out}) = \log_2 C + p_c \log_2 p_c + p_e \log_2\left(\frac{p_e}{C-1}\right) \qquad (4)$$

frequently used to express ITRs in BCI studies ||2], gOl, BTl . 
is not adequate for BCIs based on the oddball paradigm. First, 
note that for a two-class system ($C = 2$), the expression (4) 
reduces to $I(\text{in},\text{out}) = 1 + p_c \log_2 p_c + p_e \log_2 p_e$, which 
represents the capacity (the maximum achievable ITR) of a 
binary symmetric channel [26]. To achieve this upper limit, in 
addition to being symmetric, the BCI communication channel 
must maintain equal prior probabilities of oddball and non-oddball 
trials, i.e. $p(o) = p(e)$. This, however, contradicts 
the very definition of the oddball paradigm, where by design 
we must have $p(o) < p(e)$. Similarly, for a chance-level 
performance, we have $p_c = p(e)$ and $p_e = p(o)$ (see 
Appendix C), and so it follows from (4) (assuming the standard 
$p(o)$ to $p(e)$ ratio) that $I(\text{in},\text{out}) = 0.408$ bit/trial, which 
presents an obvious contradiction. On the other hand, our 
analysis correctly demonstrates (Appendix C) that subjects 
with performances $p_c < p(e)$ are not able to spell, since 
$I(\text{in},\text{out}) = 0$. Depending on the exact oddball-to-non-oddball 
ratio, offline performances substantially greater than 
50% may be necessary for successful online spelling. 
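The chance-level figure quoted above can be checked numerically. The following is a minimal sketch that evaluates formula (4) directly; the function name `itr_formula_4` is a hypothetical label chosen for illustration, and the 1:6 oddball-to-non-oddball ratio is assumed as the "standard" ratio, as in Appendix C.

```python
import math


def itr_formula_4(p_c: float, n_classes: int = 2) -> float:
    """Evaluate formula (4) in bits per trial for accuracy p_c and C classes."""
    p_e = 1.0 - p_c
    bits = math.log2(n_classes)
    if p_c > 0.0:                      # treat 0 * log2(0) as 0
        bits += p_c * math.log2(p_c)
    if p_e > 0.0:
        bits += p_e * math.log2(p_e / (n_classes - 1))
    return bits


# Assumed 1:6 ratio: a classifier that always answers "non-oddball"
# is correct with probability p_c = p(e) = 6/7.
p_chance = 6.0 / 7.0
print(f"formula (4) at chance level: {itr_formula_4(p_chance):.3f} bit/trial")
```

Under these assumptions the formula assigns a positive rate to a classifier that conveys no information, which is the contradiction noted in the text.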

For an asymmetric communication channel (Appendix C), 
the probabilities $p_e$ and $p_c$ cannot be unequivocally linked to 
the mutual information, i.e. the confusion matrix probabilities 
must be used explicitly. If these are not available, a lower 
bound based on the Fano inequality [26] may be used: 

$$I(\text{in},\text{out}) \geq H(\text{in}) + p_c \log_2 p_c + p_e \log_2 p_e \qquad (5)$$

where $H(\text{in}) = -\left[p(o)\log_2 p(o) + p(e)\log_2 p(e)\right]$ (similar 
to (13)). Likewise, an upper bound on the mutual information 
may be derived from the Hellman-Raviv inequality [21], [42]. 
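As a rough illustration of how the bound (5) behaves, the sketch below evaluates it at the two extreme operating points discussed in Appendix C. The helper names are hypothetical, and the 1:6 oddball-to-non-oddball ratio is an assumption carried over from the training procedure.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a two-outcome distribution (p, 1 - p)."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


def fano_lower_bound(p_o: float, p_c: float) -> float:
    """Right-hand side of (5): H(in) + p_c*log2(p_c) + p_e*log2(p_e)."""
    return binary_entropy(p_o) - binary_entropy(p_c)


p_o = 1.0 / 7.0                      # assumed prior of oddball trials (1:6 ratio)
for label, p_c in (("chance", 6.0 / 7.0), ("perfect", 1.0)):
    print(f"{label:8s} p_c = {p_c:.4f}  bound = {fano_lower_bound(p_o, p_c):.3f} bit/trial")
```

Unlike formula (4), the bound vanishes at chance level, where $p_c = p(e)$, and reaches $H(\text{in})$ for a perfect classifier. For $p_c < p(e)$ it becomes negative and is therefore uninformative.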

C. Improvements 

While it has achieved unprecedented, error-free, online 
typing rates, our BCI-speller has not been optimized. For 
example, as the users underwent multiple experiments, they 
became familiar with the character layout, and felt that further 
reduction of post-selection pause (e.g. from 3 to 2 sec) would 
not compromise the spelling accuracy. This step alone would 
have reduced the total spelling times (see Table IV) by at 
least 43 sec, and increased the practical, error-free ITRs by 
at least 25%. Furthermore, addition of a full word completion 
feature similar to current text-messaging systems could further 
significantly increase the practical bit rates. Implementation 
of these improvements is straightforward, although some user 
training may be required. 

Through a more elaborate procedure, the performance of 
the BCI-speller with various omission and false alarm rates 
could be simulated. This would allow the costs $\lambda_{OM}$ and $\lambda_{FA}$ 
(see Section III-D) to be evaluated more objectively and the 
classification threshold in ([B to be chosen optimally. The 
optimal threshold is likely to be interface-speed dependent, 
as the two costs scale differently with the interface speed 
due to the fact that each false alarm incurs a constant cost 
associated with the post-selection pause. Another potential 
improvement could be achieved if, in addition to training on 
stage-1 data (see Section III-C), the parameters of CPCA, 
AIDA and the Bayesian classifier were estimated from stage-2 
data. Together with more objectively estimated $\lambda_{OM}$ and 
$\lambda_{FA}$, this would allow the stage-2 threshold to be set more 
accurately (as opposed to the empirically chosen 6 = 0.5). 
In addition, fine tuning of ITI and increasing the number of 
channels, especially over the parietal lobe [13], could conceivably 
improve the ITRs even further. Finally, optimization of 
luminance [43], background/foreground color, and character 
size and spacing [44] may lead to further improvements. 

D. Conclusion 

By exploiting basic concepts from pattern recognition the- 
ory and information theory, the presented EEG-based BCI 
communication system allows for error-free selection of char- 
acters with sustained, online bit rates that are several-fold 
higher than those that have been achieved with similar BCI 
systems. More importantly, these results disprove the common 
assumption that ITRs of EEG-based BCI systems are limited 
to ~1 bit/sec im, ll34l . Since the parameters of the present 
system have not been optimized, we hypothesize that further 
substantial improvements of both character-selection and prac- 
tical ITRs can be achieved. Many of these improvements are 
straightforward, while others may require some user training. 
These results may have significant implications on the viability 
and adoption of EEG-based BCIs in both clinical and non- 
clinical applications. They also offer compelling evidence for 
further development of state-of-the-art statistical signal pro- 
cessing and pattern recognition methods aimed at the single- 
trial processing and analysis of high-dimensional EEG data. 

Appendix A 
Group Randomization of Characters 

Fig. 4. The frequency distribution of the characters in the BCI speller. 
Horizontal-axis labels, in the order shown: > E T A O I < N S H R D L C U M W * F G Y P B V 1 K J X Q Z ? , ! 3 6 7 8 0 2 4 5 9 

During BCI operation, the characters are highlighted in 
a random order, biased to the frequency of the English 
language alphabet, digits, and punctuation (see Fig. 4) [45], 
[46]. Before each stage-1 cycle, the order of characters is 
re-randomized in an iterative process by using the inverse 
sampling theorem [47]. To this end, a cumulative distribution 
function (CDF) $F_X(x)$, $x = \{>, E, T, \ldots\}$, is calculated by 
integrating the character histogram. The inverse sampling theorem 
states that if $Y$ is a uniformly distributed random variable, 
then the CDF of $X = F_X^{-1}(Y)$ is precisely $F_X$, and so a uniform 
distribution can be mapped into any distribution whose $F_X$ 
is known. To order characters according to their frequencies, 
at each iteration a uniformly sampled random number, $y^\star \sim 
\mathcal{U}[0,1]$, is mapped according to $x^\star = F_X^{-1}(y^\star)$, and the 
character corresponding to $x^\star$ is drawn without replacement. 
Since $F_X$ is discontinuous, the mapping $F_X^{-1}$ is implemented 
as a lookup table. This procedure is iterated until all 42 
characters have been drawn. The characters are subsequently 
organized into 7 groups of 6 characters according to the order 
in which they were drawn. Using this approach, the more 
frequently used characters, e.g. {>, E, T, A, O, I}, are likely 
to be highlighted earlier in a cycle. For example, Monte Carlo 
simulations show that the average number of illuminations 
necessary to highlight > (the most frequent character) is only 
1.6 (out of 7). Note, however, that since the above algorithm 
is stochastic, the order of illumination varies over cycles, 
which prevents the formation of predictable spatio-temporal 
illumination patterns that are known to weaken the P300 
response. Finally, in stage 2, the characters are illuminated 
one-by-one in the same biased random order as in stage 1. 
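The randomization procedure described in this appendix can be sketched as follows. This is an illustrative reconstruction, not the implementation used in the study: the weights are a hypothetical Zipf-like profile (1/rank) standing in for the actual character histogram, the CDF lookup table is rebuilt over the remaining characters after each draw (one possible way to sample without replacement), and all function names are assumptions.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# The 42 characters in the order of the Fig. 4 axis labels.
CHARS = ">ETAOI<NSHRDLCUMW*FGYPBV1KJXQZ?,!367802459"
# Hypothetical Zipf-like weights; the real histogram is not reproduced here.
ZIPF_WEIGHTS = [1.0 / rank for rank in range(1, len(CHARS) + 1)]


def biased_order(chars, weights, rng):
    """Draw every character once, without replacement, by inverse-CDF lookup."""
    chars, weights = list(chars), list(weights)
    order = []
    while chars:
        cdf = list(accumulate(weights))       # lookup table over remaining characters
        y = rng.random() * cdf[-1]            # uniform sample scaled to the remaining mass
        idx = min(bisect_right(cdf, y), len(chars) - 1)
        order.append(chars.pop(idx))
        weights.pop(idx)
    return order


def to_groups(order, group_size=6):
    """Split the drawn order into consecutive illumination groups."""
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]


def mean_group(char, weights, n_trials, seed=0):
    """Monte Carlo estimate of the average group (1..7) in which char appears."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        total += biased_order(CHARS, weights, rng).index(char) // 6 + 1
    return total / n_trials


print("biased :", mean_group(">", ZIPF_WEIGHTS, 2000))
print("uniform:", mean_group(">", [1.0] * len(CHARS), 2000))
```

With uniform weights every character lands, on average, in the fourth of the 7 groups. Any frequency-biased weighting moves the frequent characters toward the earlier groups, although the exact averages depend on the histogram used.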

Appendix B 
Classifier Design 

In Bayesian decision theory [23], the loss associated with 
each incorrect decision is expressed by: 

$$\mathcal{R}(o \mid F^\star) = \lambda_{FA}\, p(e \mid F^\star) \qquad (6)$$
$$\mathcal{R}(e \mid F^\star) = \lambda_{OM}\, p(o \mid F^\star) \qquad (7)$$

where $\mathcal{R}(o \mid F^\star)$ is the (conditional) risk of classifying 
observation $F^\star$ as oddball. It is the product of the probability that 
the identity of $F^\star$ is not oddball, i.e. $p(e \mid F^\star)$, and the cost 
of such a decision, $\lambda_{FA}$. The risk associated with classifying 
$F^\star$ as non-oddball, $\mathcal{R}(e \mid F^\star)$, is defined in a similar manner. 
The overall risk is then defined as: 

$$\mathcal{R} = \int \left[\mathcal{R}(o \mid F) + \mathcal{R}(e \mid F)\right] f(F)\, dF \qquad (8)$$

where $f(F)$ is the probability density function (PDF) of 
features, $F$. The minimum value of (8) is known as the Bayes 
risk, and the decision rule that accomplishes this optimum is: 

$$\mathcal{R}(e \mid F^\star) \underset{e}{\overset{o}{\gtrless}} \mathcal{R}(o \mid F^\star), \quad \forall F^\star \qquad (9)$$

i.e. make a decision that carries the smaller risk. The Bayesian 
classifier (1) then follows readily from (9). 

To estimate its parameters, the Bayes theorem is used: 

$$p(o \mid F) = \frac{f(F \mid o)\, p(o)}{f(F)}$$

where $p(o)$ is the prior probability of oddball trials and 
$f(F \mid o)$ is the conditional PDF of features under the oddball 
class. A similar expression is derived for $p(e \mid F)$. The Bayesian 
classifier (1) is then implemented as a likelihood ratio test: 

$$\frac{f(F^\star \mid o)}{f(F^\star \mid e)} \underset{e}{\overset{o}{\gtrless}} \frac{\lambda_{FA}\, p(e)}{\lambda_{OM}\, p(o)} \qquad (10)$$


The parameters of the linear Bayesian classifier are estimated 
by assuming features that are conditionally Gaussian, i.e. 
$F \mid o \sim \mathcal{N}(\mu_o, \sigma^2)$ and $F \mid e \sim \mathcal{N}(\mu_e, \sigma^2)$, where $\mu_o$ and $\mu_e$ 
are conditional sample means of features under oddball and 
non-oddball classes, respectively, and $\sigma^2$ is the (unconditional) 
sample variance of features [23]. 

We conclude by noting that if $\lambda_{FA} = \lambda_{OM}$ and the features 
are non-informative, i.e. $f(F \mid o) = f(F \mid e)$, the classifier's 
decision (10) hinges solely upon the ratio of $p(e)$ and $p(o)$. 
Since oddball experimental paradigms imply $p(e) > p(o)$, 
the Bayesian classifier (10) will always pick the non-oddball 
class in this case. This decision rule defines the chance-level 
performance of the Bayesian classifier (see Appendix C). 
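The likelihood ratio test (10) with conditionally Gaussian features can be sketched as below. This is a minimal illustration under stated assumptions: a one-dimensional feature, a variance shared by both classes, and a comparison carried out in the log domain for numerical stability. The function name, argument names, and example values are hypothetical.

```python
import math


def classify(x, mu_o, mu_e, var, p_o, cost_fa=1.0, cost_om=1.0):
    """Return 'o' (oddball) or 'e' (non-oddball) for a scalar feature x.

    Implements test (10) in the log domain:
    log f(x|o) - log f(x|e)  >  log[(cost_fa * p(e)) / (cost_om * p(o))].
    """
    # With a shared variance the Gaussian normalizing constants cancel.
    log_ratio = ((x - mu_e) ** 2 - (x - mu_o) ** 2) / (2.0 * var)
    log_threshold = math.log((cost_fa * (1.0 - p_o)) / (cost_om * p_o))
    return "o" if log_ratio > log_threshold else "e"


p_o = 1.0 / 7.0   # assumed 1:6 oddball-to-non-oddball ratio

# Informative features (hypothetical class means 2 and 0, unit variance).
print(classify(3.0, mu_o=2.0, mu_e=0.0, var=1.0, p_o=p_o))
print(classify(0.0, mu_o=2.0, mu_e=0.0, var=1.0, p_o=p_o))

# Non-informative features with equal costs: the prior alone decides.
print(classify(3.0, mu_o=0.0, mu_e=0.0, var=1.0, p_o=p_o))
```

In the last call the likelihood ratio equals 1 for every observation, so with equal costs and $p(e) > p(o)$ the test always returns the non-oddball class, which is the chance-level behavior described above.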

Appendix C 
ITR Calculation 

Fig. 5 shows the schematic of an asymmetric binary communication 
channel with inputs: $o$ (intent to select the highlighted 
character, oddball) and $e$ (intent not to select the highlighted 
character, non-oddball), and outputs: $\hat{o}$ (oddball detected) and 
$\hat{e}$ (non-oddball detected). The transition probabilities between 
inputs and outputs can be estimated by performing the 10-fold CV (see 
Section III-D) and comparing the true and decoded identities 
of test trials. This procedure yields a confusion matrix: 



$$\mathrm{CM} = \begin{bmatrix} p(\hat{o} \mid o) & p(\hat{e} \mid o) \\ p(\hat{o} \mid e) & p(\hat{e} \mid e) \end{bmatrix} \qquad (11)$$

where $p(\hat{o} \mid o)$ and $p(\hat{e} \mid e)$ are the fractions of correctly 
decoded oddball and non-oddball trials, respectively. Note that 
$p(\hat{e} \mid o) = 1 - p(\hat{o} \mid o)$ represents the probability of omission 
and $p(\hat{o} \mid e) = 1 - p(\hat{e} \mid e)$ is the probability of a false alarm. 
The probability of error (misclassification) is then defined as: 
$p_e = p(\hat{e} \mid o)\,p(o) + p(\hat{o} \mid e)\,p(e)$. Finally, the probability of 
correct classification is defined as: $p_c = 1 - p_e$. 
By the law of total probability, we have: 

$$\begin{aligned} p(\hat{o}) &= p(\hat{o} \mid o)\,p(o) + p(\hat{o} \mid e)\,p(e) \\ p(\hat{e}) &= p(\hat{e} \mid o)\,p(o) + p(\hat{e} \mid e)\,p(e) \end{aligned} \qquad (12)$$


where $p(\hat{o})$ and $p(\hat{e})$ are simply the (unconditional) probabilities 
of decoded trials. Since the output of the communication 
channel takes the values $\hat{o}$ and $\hat{e}$, its unconditional 
entropy [26] is given by: 

$$H(\text{out}) = -\left[p(\hat{o})\log_2 p(\hat{o}) + p(\hat{e})\log_2 p(\hat{e})\right] \qquad (13)$$

To complete the calculation of the mutual information (2), we 
estimate the conditional entropy, $H(\text{out} \mid \text{in})$, as in [26]: 

$$\begin{aligned} H(\text{out} \mid \text{in}) &= H(\text{out} \mid \text{in} = o)\,p(o) + H(\text{out} \mid \text{in} = e)\,p(e) \\ &= -\left[p(\hat{o} \mid o)\log_2 p(\hat{o} \mid o) + p(\hat{e} \mid o)\log_2 p(\hat{e} \mid o)\right] p(o) \\ &\quad - \left[p(\hat{o} \mid e)\log_2 p(\hat{o} \mid e) + p(\hat{e} \mid e)\log_2 p(\hat{e} \mid e)\right] p(e) \end{aligned} \qquad (14)$$

Fig. 5. BCI system as a noisy communication channel. 

Thus, the mutual information can be calculated by subtracting 
(14) from (13). It can also be shown that unless a 
communication channel is symmetric, i.e. $p(\hat{e} \mid o) = p(\hat{o} \mid e)$, 
$I(\text{in},\text{out})$ cannot be expressed as a function of $p_e$ and $p_c$. 

If observed features, $F^\star$, carry no class-relevant information, 
the Bayesian classifier (10) will assign them to the 
more numerous class (non-oddball), regardless of their class 
identities (see Appendix B). The confusion matrix entries then 
become: $p(\hat{o} \mid o) = p(\hat{o} \mid e) = 0$ and $p(\hat{e} \mid o) = p(\hat{e} \mid e) = 1$, 
and the chance-level performance of the Bayesian classifier 
is: $p_e = p(o)$ and $p_c = p(e)$. Assuming the oddball-to-non-oddball 
ratio of 1:6, we arrive at the chance performance: $p_c = 
85.71\%$. Under this condition, it readily follows from (13) 
and (14) (after using L'Hôpital's rule) that $H(\text{out}) = 0$ and 
$H(\text{out} \mid \text{in}) = 0$. Therefore, the chance-level performance 
indeed yields $I(\text{in},\text{out}) = 0$. 

In the case of the perfect classifier, we have $p(\hat{o} \mid o) = 
p(\hat{e} \mid e) = 1$, and so from (12) we obtain $p(\hat{o}) = p(o)$ and 
$p(\hat{e}) = p(e)$. It follows immediately that $H(\text{out} \mid \text{in}) = 0$, 
since the output is no longer considered random once the input 
is known. Assuming the same 1:6 oddball-to-non-oddball ratio, 
we obtain: $I(\text{in},\text{out}) = H(\text{out}) = 0.592$ bit/transmission, 
which is the theoretical maximum of this channel. 
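The calculation in (12)-(14) can be sketched in a few lines. The code below is an illustrative implementation, not the one used in the study; the function names are hypothetical, the confusion matrix follows the layout of (11), and the term $0 \log_2 0$ is taken as 0, in line with the limit argument above.

```python
import math


def plog2p(p: float) -> float:
    """Return p * log2(p), with the limiting value 0 at p = 0."""
    return p * math.log2(p) if p > 0.0 else 0.0


def mutual_information(cm, p_o: float) -> float:
    """I(in, out) in bits for cm = [[p(ô|o), p(ê|o)], [p(ô|e), p(ê|e)]]."""
    priors = (p_o, 1.0 - p_o)
    # (12): unconditional probabilities of the decoded outputs
    p_out = [sum(priors[i] * cm[i][j] for i in range(2)) for j in range(2)]
    # (13): entropy of the output
    h_out = -sum(plog2p(p) for p in p_out)
    # (14): conditional entropy of the output given the input
    h_out_given_in = -sum(
        priors[i] * sum(plog2p(cm[i][j]) for j in range(2)) for i in range(2)
    )
    return h_out - h_out_given_in


p_o = 1.0 / 7.0                            # 1:6 oddball-to-non-oddball ratio
chance = [[0.0, 1.0], [0.0, 1.0]]          # always decodes non-oddball
perfect = [[1.0, 0.0], [0.0, 1.0]]         # no omissions, no false alarms
print(f"chance : {mutual_information(chance, p_o):.3f} bit/transmission")
print(f"perfect: {mutual_information(perfect, p_o):.3f} bit/transmission")
```

The two calls reproduce the limiting cases worked out in this appendix: zero mutual information at chance level and $H(\text{out}) = 0.592$ bit/transmission for the perfect classifier.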

References 

[1] N. Birbaumer, N. Ghanayim, T. Hinterberger, I. Iversen, B. Kotchoubey, A. Kübler, J. Perelmouter, E. Taub, and H. Flor. A spelling device for the paralysed. Nature, 398(6725):297-298, 1999. 
[2] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, and T.M. Vaughan. Brain-computer interfaces for communication and control. Clin Neurophysiol, 113(6):767-791, 2002. 
[3] J.R. Wolpaw and D.J. McFarland. Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans. Proc Natl Acad Sci USA, 101(51):17849-17854, 2004. 
[4] D.J. McFarland, W.A. Sarnacki, and J.R. Wolpaw. Electroencephalographic (EEG) control of three-dimensional movement. J Neural Eng, 7(3):036007, 2010. 
[5] R. Leeb, D. Friedman, G.R. Müller-Putz, R. Scherer, M. Slater, and G. Pfurtscheller. Self-paced (asynchronous) BCI control of a wheelchair in virtual environments: a case study with a tetraplegic. Comput Intell Neurosci, page 79642, 2007. 
[6] P.T. Wang, C.E. King, L.A. Chui, Z. Nenadic, and A.H. Do. BCI controlled walking simulator for a BCI driven FES device. In Proc. of RESNA Annual Conference, Las Vegas, Nevada, 2010. 
[7] A.H. Do, P.T. Wang, A. Abiri, C.E. King, and Z. Nenadic. Brain-computer interface controlled functional electrical stimulation system for ankle movement. J Neuroeng Rehabil, 8:49, 2011. 
[8] G.R. Müller-Putz, R. Scherer, G. Pfurtscheller, and R. Rupp. EEG-based neuroprosthesis control: a step towards clinical practice. Neurosci Lett, 382(1-2):169-174, 2005. 
[9] G. Pfurtscheller, G.R. Müller, J. Pfurtscheller, H.J. Gerner, and R. Rupp. 'Thought'-control of functional electrical stimulation to restore hand grasp in a patient with tetraplegia. Neurosci Lett, 351(1):33-36, 2003. 
[10] L.A. Farwell and E. Donchin. Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr Clin Neurophysiol, 70(6):510-523, 1988. 
[11] S. Sutton, M. Braren, J. Zubin, and E.R. John. Evoked-potential correlates of stimulus uncertainty. Science, 150(700):1187-1188, 1965. 
[12] K.C. Squires, S. Petuchowski, C. Wickens, and E. Donchin. The effects of stimulus sequence on event related potentials: a comparison of visual and auditory sequences. Perception & Psychophysics, 22(1):31-40, 1977. 

[13] D.J. Krusienski, E.W. Sellers, D.J. McFarland, T.M. Vaughan, and J.R. Wolpaw. Toward enhanced P300 speller performance. J Neurosci Methods, 167(1):15-21, 2008. 
[14] G. Townsend, B.K. LaPallo, C.B. Boulay, D.J. Krusienski, G.E. Frye, C.K. Hauser, N.E. Schwartz, T.M. Vaughan, J.R. Wolpaw, and E.W. Sellers. A novel P300-based brain-computer interface stimulus presentation paradigm: moving beyond rows and columns. Clin Neurophysiol, 121(7):1109-1120, 2010. 
[15] T.E. Hutchinson, K.P. White, W.N. Martin, K.C. Reichert, and L.A. Frey. Human-computer interaction using eye-gaze input. IEEE Trans. Syst., Man, Cybern., 19(6):1527-1534, 1989. 
[16] S. Thorpe, D. Fize, and C. Marlot. Speed of processing in the human visual system. Nature, 381(6582):520-522, 1996. 
[17] K. Das and Z. Nenadic. An efficient discriminant-based solution for small sample size problem. Pattern Recogn, 42(5):857-866, 2009. 
[18] K. Das, S. Osechinskiy, and Z. Nenadic. A classwise PCA-based recognition of neural data for brain-computer interfaces. In Proc. 29th Annual Int. Conf. of the IEEE Engineering in Medicine and Biology Society, pages 6520-6523, 2007. 
[19] K. Das and Z. Nenadic. Approximate information discriminant analysis: a computationally simple heteroscedastic feature extraction technique. Pattern Recogn, 41(5):1548-1557, 2008. 
[20] S.T. Roweis and L.K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323-2326, 2000. 
[21] Z. Nenadic. Information discriminant analysis: Feature extraction with an information-theoretic objective. IEEE Trans. Pattern Anal. Mach. Intell., 29(8):1394-1407, 2007. 
[22] K. Torkkola. Feature extraction by non-parametric mutual information maximization. J. Mach. Learn. Res., 3:1415-1438, 2003. 
[23] R.O. Duda, P.E. Hart, and D.G. Stork. Pattern Classification. Wiley-Interscience, New York, 2001. 
[24] Z. Nenadic and J.W. Burdick. Spike detection using the continuous wavelet transform. IEEE T. Biomed. Eng., 52(1):74-87, 2005. 
[25] R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Int. Joint C. Art. Int., pages 1137-1145, 1995. 
[26] T.M. Cover and J.A. Thomas. Elements of Information Theory. Wiley-Interscience, New York, 1991. 

[27] E.W. Sellers, D.J. Krusienski, D.J. McFarland, and J.R. Wolpaw. Non-invasive brain-computer interface research at the Wadsworth Center. In G. Dornhege, J.R. Millan, T. Hinterberger, D.J. McFarland, and K.-R. Müller, editors, Toward Brain-Computer Interfacing, pages 31-42. The MIT Press, 2007. 
[28] P.L. Nunez and R. Srinivasan. Electric Fields of the Brain: The Neurophysics of EEG. Oxford University Press, New York, 2nd edition, 2006. 
[29] C.L. Gonsalvez and J. Polich. P300 amplitude is determined by target-to-target interval. Psychophysiology, 39(3):388-396, 2002. 
[30] M. Kaper, P. Meinicke, U. Grossekathoefer, T. Lingner, and H. Ritter. BCI competition 2003-data set IIb: support vector machines for the P300 speller paradigm. IEEE Trans. Biomed. Eng., 51(6):1073-1076. 
[31] P. Meinicke, M. Kaper, F. Hoppe, M. Heumann, and H. Ritter. Improving transfer rates in brain computer interfacing: A case study. In NIPS, pages 1107-1114, 2002. 
[32] H. Serby, E. Yom-Tov, and G.F. Inbar. An improved P300-based brain-computer interface. IEEE Trans. Neural Syst. Rehabil. Eng., 13(1):89-98, 2005. 
[33] C. Guan, M. Thulasidas, and J. Wu. High performance P300 speller for brain-computer interface. In IEEE International Workshop on Biomedical Circuits and Systems, 2004. 
[34] G. Santhanam, S.I. Ryu, B.M. Yu, A. Afshar, and K.V. Shenoy. A high-performance brain-computer interface. Nature, 442(7099):195-198, 2006. 
[35] E.W. Sellers, D.J. Krusienski, D.J. McFarland, T.M. Vaughan, and J.R. Wolpaw. A P300 event-related potential brain-computer interface (BCI): the effects of matrix size and inter stimulus interval on performance. Biol Psychol, 73(3):242-252, 2006. 
[36] K. Das, D.S. Rizzuto, and Z. Nenadic. Mental state estimation for brain-computer interfaces. IEEE Trans. Biomed. Eng., 56(8):2114-2122, 2009. 
[37] A.H. Do, P.T. Wang, C.E. King, L.A. Chui, and Z. Nenadic. Asynchronous BCI control of a walking simulator. In the Fourth International BCI Meeting, Asilomar, CA, June, 2010. 
[38] C.E. King, P.T. Wang, M. Mizuta, D.J. Reinkensmeyer, A.H. Do, S. Moromugi, and Z. Nenadic. Noninvasive brain-computer interface driven hand orthosis. In Proc. 33rd Annual Int. Conf. of the IEEE Engineering in Medicine and Biology Society, pages 5786-5789, 2011. 
[39] J.R. Pierce. An introduction to information theory: symbols, signals, and noise. Dover Publications, Inc., New York, 2nd edition, 1980. 
[40] C.S. Nam, Y. Li, and S. Johnson. Evaluation of a P300-based brain-computer interface in real-world contexts. Int J Hum-Comput Int, 26(6):621-637, 2010. 
[41] J.R. Wolpaw, N. Birbaumer, W.J. Heetderks, D.J. McFarland, P.H. Peckham, G. Schalk, E. Donchin, L.A. Quatrano, C.J. Robinson, and T.M. Vaughan. Brain-computer interface technology: a review of the first international meeting. IEEE Trans. Rehabil. Eng., 8(2):164-173, 2000. 
[42] M. Hellman and J. Raviv. Probability of error, equivocation, and the Chernoff bound. IEEE Trans. Inf. Theory, 16(4):368-372, 1970. 
[43] K. Takano, T. Komatsu, N. Hata, Y. Nakajima, and K. Kansaku. Visual stimuli for the P300 brain-computer interface: a comparison of white/gray and green/blue flicker matrices. Clin Neurophysiol, 120(8):1562-1566, 2009. 
[44] M. Salvaris and F. Sepulveda. Visual modifications on the P300 speller BCI paradigm. J Neural Eng, 6(4):046011, 2009. 
[45] E. Stewart Lee. Essays about computer security, page 181. University of Cambridge Computer Laboratory, 1999. 
[46] R. Lewand. Cryptological mathematics, page 36. The Mathematical Association of America, 2000. 
[47] L. Devroye. Non-Uniform Random Variate Generation, chapter 2, pages 27-39. Springer-Verlag, New York, 1986. 


